Frequent Itemsets Mining for Database Auto-Administration

Authors

  • Kamel Aouiche
  • Jérôme Darmont
  • Le Gruenwald
Abstract

With the wide development of databases in general and data warehouses in particular, it is important to reduce the tasks that a database administrator must perform manually. The aim of auto-administrative systems is to administer and adapt themselves automatically without loss (or even with a gain) in performance. The idea of using data mining techniques to extract useful knowledge for administration from the data themselves has existed for some years. However, little research has been carried out on it. This idea nevertheless remains a very promising approach, notably in the field of data warehousing, where queries are very heterogeneous and cannot be interpreted easily. The aim of this study is to search for a way of extracting useful knowledge from the stored data themselves in order to automatically apply performance optimization techniques, and more particularly indexing techniques. We have designed a tool that extracts frequent itemsets from a given workload to compute an index configuration that helps optimize data access time. The experiments we performed showed that the index configurations generated by our tool allowed performance gains of 15% to 25% on a test database and a test data warehouse.
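The idea of mining a workload for frequent itemsets can be illustrated with a minimal sketch. It is not the authors' tool: it assumes each query has already been reduced to the set of attributes it filters or groups on, it uses a brute-force enumeration in place of a dedicated mining algorithm, and the attribute names, the `frequent_itemsets` helper, and the 0.5 support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for attribute sets meeting min_support.

    Brute-force enumeration of every subset of every transaction;
    only suitable for small illustrative workloads.
    """
    n = len(transactions)
    counts = Counter()
    for attrs in transactions:
        items = sorted(attrs)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical workload: one set of referenced attributes per query.
workload = [
    {"customer.city", "sales.date"},
    {"customer.city", "sales.date", "product.category"},
    {"customer.city"},
    {"product.category", "sales.date"},
]

# Each frequent itemset is a candidate (possibly composite) index.
candidates = frequent_itemsets(workload, min_support=0.5)
for itemset, support in sorted(
    candidates.items(), key=lambda kv: (-kv[1], sorted(kv[0]))
):
    print(sorted(itemset), support)
```

In this sketch every frequent itemset is treated as an index candidate; a real configuration step would still have to weigh storage and maintenance costs before building any of them.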

Similar resources

Data sanitization in association rule mining based on impact factor

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates concerns for individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

An Efficient Method for Mining Frequent Itemsets in Market Basket Data Analysis

Discovery of hidden and valuable knowledge from large data warehouses is an important research area and has attracted the attention of many researchers in recent years. Most Association Rule Mining (ARM) algorithms start by searching for frequent itemsets by scanning the whole database repeatedly and enumerating the occurrences of each candidate itemset. In data mining problems, the size of ...

Review on Matrix Based Efficient Apriori Algorithm

The Apriori algorithm is one of the best-known and most widely used algorithms in the field of data mining. It is an association rule mining algorithm used to find frequent itemsets from the transactions in a database. The association rules are then generated from these frequent itemsets. The frequent itemset mining algorithms discover the frequent...

Maximal Frequent Itemsets Mining Using Database Encoding

Frequent itemsets mining is a classic problem in data mining and has played an important role in data mining research for over a decade. However, mining all frequent itemsets leads to a massive number of itemsets. Fortunately, this problem can be reduced to the mining of maximal frequent itemsets. In this paper, we propose a new method for mining maximal frequent itemsets. Our method ...

An Efficient Algorithm for Maintaining Frequent Closed Itemsets over Data Stream

Data mining refers to the process of revealing unknown and potentially useful information from a large database. Frequent itemsets mining is one of the foundational problems in data mining; its goal is to discover the sets of products that customers frequently purchase together in a transaction database. However, a large number of patterns may be generated from the database, and many of the...

Publication date: 2003